How to aggregate software metrics?

Authors

  • Bogdan Vasilescu
  • Alexander Serebrenik
  • Mark van den Brand
Abstract

Maintaining a software system resembles renovating a house: it usually takes longer and costs more than planned. Just as a house owner identifies potential problems before renovation, a software owner should assess the maintainability of software before renovating or extending it. To measure maintainability one often applies metrics, associating software artifacts with numbers. Unfortunately, metrics are commonly measured at method or class level, and fail to provide an adequate picture of the maintainability of the entire system. Continuing the analogy, metrics detail the state of every brick but obscure the assessment in the multitude of details. To see the forest of a software system for the trees of individual measurements, one uses aggregation techniques such as the mean, median, sum, or, recently, the Gini, Theil, Kolm, Atkinson, and Hoover indices. A formal comparison of these techniques has been missing until now. We present an extensive correlation study of the aforementioned techniques, applied to size (e.g., number of lines of code, semicolons, or statements) and complexity (e.g., percentage of branching statements, depth of inheritance tree, or number of children) metrics. We conducted an empirical evaluation on the 106 open source Java systems comprising the Qualitas Corpus. We observed, e.g., that size and complexity metrics aggregated by Gini, Theil, Hoover, and Atkinson strongly correlate, while mean and Kolm correlate on size but not on complexity metrics [1]. Based on our study a software owner can choose an appropriate aggregation technique depending on, e.g., the presence of negative values, or the relative importance of high/low values.

BODY

When aggregating metrics to assess software maintainability, Gini, Theil, Hoover, and Atkinson strongly correlate.
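The aggregation techniques named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: it uses common textbook definitions of the indices, and the function names, the Atkinson parameter of 1, the Kolm parameter `alpha=1.0`, and the sample values are all assumptions made for illustration. Gini, Theil, Hoover, and Atkinson as written here assume non-negative values with a positive sum (Atkinson with parameter 1 needs strictly positive values), while Kolm depends only on differences from the mean and therefore tolerates negative values.

```python
import math
from statistics import mean, median


def gini(xs):
    """Gini index via the sorted-rank formula (assumes non-negative values)."""
    xs = sorted(xs)
    n, total = len(xs), sum(xs)
    if total == 0:
        return 0.0
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def theil(xs):
    """Theil index; zero values contribute 0 (the limit of t*ln(t) as t -> 0)."""
    mu = mean(xs)
    return sum((x / mu) * math.log(x / mu) for x in xs if x > 0) / len(xs)


def hoover(xs):
    """Hoover index: share of the total that would have to be redistributed."""
    mu = mean(xs)
    return sum(abs(x - mu) for x in xs) / (2 * sum(xs))


def atkinson(xs):
    """Atkinson index with parameter 1 (assumed): 1 - geometric mean / mean."""
    mu = mean(xs)
    geometric = math.exp(sum(math.log(x) for x in xs) / len(xs))
    return 1 - geometric / mu


def kolm(xs, alpha=1.0):
    """Kolm index with an assumed parameter alpha > 0; shift-invariant."""
    mu = mean(xs)
    return math.log(sum(math.exp(alpha * (mu - x)) for x in xs) / len(xs)) / alpha


# Hypothetical per-class lines-of-code measurements for one system.
loc_per_class = [12, 40, 7, 350, 25, 18]
for name, aggregate in [("mean", mean), ("median", median), ("sum", sum),
                        ("Gini", gini), ("Theil", theil), ("Hoover", hoover),
                        ("Atkinson", atkinson), ("Kolm", kolm)]:
    print(f"{name:>8}: {aggregate(loc_per_class):.3f}")
```

Each function maps the per-class measurements of one system to a single number, so applying the aggregators to many systems yields one series per technique, which is the kind of data a correlation study compares.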

Similar resources

Can Fuzzy Mathematics enrich the Assessment of Software Maintainability?

Software maintainability depends both on qualitative and quantitative data. Existing maintainability models aggregate data into hierarchies of characteristics with given dependencies. However, data used to score the characteristics can be uncertain or even completely unknown. Therefore, it would be meaningful to evaluate sensitivity of the aggregated result, i.e. the maintainability, with respe...

Metrics and Evaluation Tools for Patient Engagement in Healthcare Organization- and System-Level Decision-Making: A Systematic Review

Background: Patient, public, consumer, and community (P2C2) engagement in organization-, community-, and system-level healthcare decision-making is increasing globally, but its formal evaluation remains challenging. To define a taxonomy of possible P2C2 engagement metrics and compare existing evaluation tools against this taxonomy, we conducted a systematic review. Methods: A broad search strate...

Getting the right ‘scale’ in tool based structural fitness measurement

How do we ensure a predictable level of internal code quality that reconciles with the current scale and context of software engineering characterized by very large applications, geographically dispersed teams and tight deadlines? We do not have enough experts to scale our code review method to the practical context of software development. We introduce structural fitness as one aspect of softw...

A Page-Classification Approach to Web Usage Semantic Analysis

With the emergence of the World Wide Web, analyzing and improving Web communication has become essential to adapt the Web content to the visitors’ expectations. Web communication analysis is traditionally performed by Web analytics software, which produce long lists of page-based audience metrics. These results suffer from page synonymy, page polysemy, page temporality, and page volatility. In ...

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on the evaluation of search engines. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

Journal title:
  • TinyToCS

Volume 1  Issue 

Pages  -

Publication date: 2012